Comparing Exact Bayesian and BIC Markov Order Classifiers

Authors

  • H. S. Bhat
  • N. Kumar
Abstract

We use an exact Bayesian calculation to design classifiers that distinguish whether a finite sequence drawn from a finite alphabet is a sample path of a Markov chain of order k = 0 or of order k > 0. Three exact Bayes (EB) classifiers are derived, each corresponding to a different prior. We also include a classifier based on the Bayesian Information Criterion (BIC), a popular technique for Markov order estimation. Using thousands of random Markov chains of known order, we test the performance of the classifiers. In both average accuracy and ROC analyses, we find that EB classifiers with informative priors perform better than the BIC classifier, with the difference becoming strikingly large when either the size of the alphabet is large or the length of the sequence is small. We also test the classifiers on five real-world data sets and find that the EB classifications, unlike the BIC classifications, match the orders of the models with highest out-of-sample predictive accuracies.
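
As a rough illustration of what an exact Bayesian order classifier can look like, the sketch below computes the closed-form marginal likelihood of a sequence under each Markov order and reports the posterior probability of order 0. The abstract does not state which priors the authors use, so everything specific here is an assumption: a symmetric Dirichlet prior (`alpha`) on each row of the transition matrix, a uniform prior over the orders 0 to `max_order`, conditioning on the first `max_order` symbols, and the function names `log_marginal` and `prob_order_zero`.

```python
import math
from collections import Counter


def log_marginal(seq, k, m, alpha=1.0, start=None):
    """Log marginal likelihood of seq[start:] under an order-k Markov chain.

    Assumption (not from the paper): each transition row has an independent
    symmetric Dirichlet(alpha) prior over an alphabet of size m.
    """
    start = k if start is None else start
    ctx_counts = Counter()    # how often each length-k context occurs
    trans_counts = Counter()  # how often each (context, next symbol) occurs
    for i in range(start, len(seq)):
        ctx = tuple(seq[i - k:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    total = 0.0
    # Dirichlet-multinomial normalising terms, one per observed context.
    for n_c in ctx_counts.values():
        total += math.lgamma(m * alpha) - math.lgamma(m * alpha + n_c)
    # One term per observed (context, symbol) pair.
    for n_cs in trans_counts.values():
        total += math.lgamma(alpha + n_cs) - math.lgamma(alpha)
    return total


def prob_order_zero(seq, m, max_order, alpha=1.0):
    """Posterior probability that the order is 0, with a uniform prior
    over orders 0..max_order (an assumption made for this sketch)."""
    # Every model scores the same symbols: those after the first max_order.
    logs = [log_marginal(seq, k, m, alpha, start=max_order)
            for k in range(max_order + 1)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    return weights[0] / sum(weights)
```

A sequence would be classified as order 0 when `prob_order_zero` exceeds a chosen threshold (0.5 is a natural default), and as order k > 0 otherwise. Sweeping the threshold is one way to trace out a ROC curve of the kind the abstract mentions.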

Related resources

The consistency of the BIC Markov order estimator

The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2, \ldots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$-th order maximum likelihood and the penalty term $\frac{|A|^k (|A|-1)}{2} \log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as ...
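
The BIC estimator described above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the function names `max_log_likelihood` and `bic_order`, the search cap `max_order`, and the choice to condition each order-k likelihood on the first k symbols are all assumptions made for the example.

```python
import math
from collections import Counter


def max_log_likelihood(seq, k):
    """Log of the k-th order maximum likelihood of seq.

    Assumption: the first k symbols are treated as given, so only
    transitions from position k onward contribute.
    """
    ctx_counts = Counter()    # occurrences of each length-k context
    trans_counts = Counter()  # occurrences of each (context, next symbol)
    for i in range(k, len(seq)):
        ctx = tuple(seq[i - k:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    # Plug in the empirical transition frequencies (the ML estimates).
    return sum(c * math.log(c / ctx_counts[ctx])
               for (ctx, _), c in trans_counts.items())


def bic_order(seq, alphabet_size, max_order):
    """Return the k in 0..max_order minimising
    -log(max likelihood) + |A|^k (|A| - 1) / 2 * log n."""
    n = len(seq)

    def score(k):
        penalty = alphabet_size ** k * (alphabet_size - 1) / 2 * math.log(n)
        return -max_log_likelihood(seq, k) + penalty

    return min(range(max_order + 1), key=score)
```

For example, `bic_order("ab" * 50, 2, 3)` selects order 1: the strictly alternating sequence is fit perfectly by a first-order chain, and the larger penalty rules out higher orders.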

Hidden Markov Random Field Model Selection Criteria Based on Mean Field-Like Approximations

Hidden Markov random fields appear naturally in problems such as image segmentation, where an unknown class assignment has to be estimated from the observations at each pixel. Choosing the probabilistic model that best accounts for the observations is an important first step for the quality of the subsequent estimation and analysis. A commonly used selection criterion is the Bayesian Informatio...

Consistency of the Bic Order Estimator

We announce two results on the problem of estimating the order of a Markov chain from observation of a sample path. First is that the Bayesian Information Criterion (BIC) leads to an almost surely consistent estimator. Second is that the Bayesian minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. proc...

ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data

Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) indep...

Simulation Results for Markov Model Seletion : AIC, BIC and EDC

Higher-order Markov chains are, by their very definition, the most flexible models for finitely dependent sequences of random variables. In practical settings, estimation of the dependency order is needed to identify other model parameters. Based on the penalized log-likelihood function and within a nested hypothesis testing framework, several estimation alternatives have been proposed. The AIC, Akai...

Publication date: 2011